NSF PAR Search | NSF Public Access Repository

SelectFormer: Private and Practical Data Selection for Transformers

Ouyang, Xu; Lin, Felix Xiaozhu; Ji, Yangfeng (April 2025, ICLR 2025)

Free, publicly-accessible full text available April 24, 2026

Addressing Both Statistical and Causal Gender Fairness in NLP Models

Chen, Hannah; Ji, Yangfeng; Evans, David (June 2024, Findings of NAACL)

Full Text Available

Improve Temporal Awareness of LLMs for Sequential Recommendation

Chu, Zhendong; Wang, Zichao; Zhang, Ruiyi; Ji, Yangfeng; Wang, Hongning; Sun, Tong (June 2024, 1st ICML Workshop on In-Context Learning at ICML 2024)

Full Text Available

Improving Interpretability via Explicit Word Interaction Graph Layer

https://doi.org/10.1609/aaai.v37i11.26586

Sekhon, Arshdeep; Chen, Hanjie; Shrivastava, Aman; Wang, Zhe; Ji, Yangfeng; Qi, Yanjun (June 2023, Proceedings of the AAAI Conference on Artificial Intelligence)

Recent NLP literature has seen growing interest in improving model interpretability. Along this direction, we propose a trainable neural network layer that learns a global interaction graph between words and then selects more informative words using the learned word interactions. Our layer, which we call WIGRAPH, can plug into any neural network-based NLP text classifier right after its word embedding layer. Across multiple state-of-the-art NLP models and various NLP datasets, we demonstrate that adding the WIGRAPH layer substantially improves the models' interpretability and enhances their prediction performance at the same time.
Full Text Available

Finding Friends and Flipping Frenemies: Automatic Paraphrase Dataset Augmentation Using Graph Theory

Chen, Hannah; Ji, Yangfeng; Evans, David (January 2020, Findings of ACL: Empirical Methods in Natural Language Processing)

Most NLP datasets are manually labeled, and so suffer from inconsistent labeling or limited size. We propose methods for automatically improving datasets by viewing them as graphs with expected semantic properties. We construct a paraphrase graph from the provided sentence pair labels, and create an augmented dataset by directly inferring labels from the original sentence pairs using a transitivity property. We use structural balance theory to identify likely mislabelings in the graph, and flip their labels. We evaluate our methods on paraphrase models trained using these datasets, starting from a pretrained BERT model, and find that the automatically enhanced training sets result in more accurate models.
Full Text Available
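
The transitivity idea in the abstract above can be illustrated with a small sketch. It is not the authors' implementation. It assumes two rules that the abstract does not spell out: paraphrase is transitive (A~B and B~C imply A~C), and a paraphrase of a sentence inherits that sentence's non-paraphrase relations. The function name `infer_labels` and the `(sentence_a, sentence_b, label)` input format are hypothetical. A non-paraphrase label between two sentences already linked by paraphrase edges is reported as a conflict, which is only a rough stand-in for the structural balance check.

```python
from itertools import combinations


def infer_labels(pairs):
    """Infer new pair labels from (a, b, label) triples.

    label 1 = paraphrase, label 0 = non-paraphrase (assumed encoding).
    Returns (inferred, conflicts): inferred maps sorted sentence pairs to
    labels, conflicts lists negative pairs that fall inside one cluster.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge sentences connected by paraphrase edges into clusters.
    for a, b, label in pairs:
        ra, rb = find(a), find(b)
        if label == 1 and ra != rb:
            parent[ra] = rb

    clusters = {}
    for s in list(parent):
        clusters.setdefault(find(s), []).append(s)

    known = {frozenset((a, b)) for a, b, _ in pairs}
    inferred, conflicts = {}, []

    # Assumed rule 1: every unlabeled pair inside a cluster is a paraphrase.
    for members in clusters.values():
        for a, b in combinations(sorted(members), 2):
            if frozenset((a, b)) not in known:
                inferred[(a, b)] = 1

    # Assumed rule 2: a negative edge between two clusters makes every
    # unlabeled cross-cluster pair a non-paraphrase.
    for a, b, label in pairs:
        if label != 0:
            continue
        ra, rb = find(a), find(b)
        if ra == rb:
            # Negative edge inside a paraphrase cluster: likely mislabeling.
            conflicts.append((a, b))
            continue
        for x in clusters[ra]:
            for y in clusters[rb]:
                key = tuple(sorted((x, y)))
                if frozenset(key) not in known:
                    inferred[key] = 0

    return inferred, conflicts
```

For example, given the labeled pairs `("s1", "s2", 1)`, `("s2", "s3", 1)`, and `("s3", "s4", 0)`, the sketch infers that `s1` and `s3` are paraphrases and that neither `s1` nor `s2` is a paraphrase of `s4`.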